DETAILED ACTION



Response to Arguments 



1. Applicant's arguments, see Decision by Board of Patent Appeals and Interferences, filed
4/20/05, with respect to the rejection(s) of claim(s) 1-37 under 35 U.S.C. 102(e) with US
6,046,745 have been fully considered and are persuasive. Therefore, the rejection has
been withdrawn. However, upon further consideration, a new ground(s) of rejection is
made in view of Jain et al (5,729,471) and Jain et al in view of Lee (5,612,743).

Claim Rejections - 35 USC § 102 

2. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public
use or on sale in this country, more than one year prior to the date of application for patent in the United
States.

3. Claims 1-16, 18, 19, 21, 22, 36 and 37 are rejected under 35 U.S.C. 102(b) as
being anticipated by Jain et al (5,729,471).

Regarding claim 1, Jain discloses a method of recovering a three-dimensional
scene from two-dimensional images, the method comprising:

providing a sequence of images (fig. 12, note camera 1 obtains video images in
two-dimensional form; also see fig.8, note camera 1 obtains a sequence of two-
dimensional images, and cameras 2 and 3 also obtain a corresponding sequence of
images; col.22, ln.1-3);

dividing the sequence of images into segments (fig.8, note that camera 1 obtains
a sequence of 412 frames for approximately 13 seconds, and that every 30 frames 
obtained for each second, ie. the standard NTSC frame rate (30 frames/sec), can be 
considered a segment, so in this case, camera 1 has approximately 14 segments, thus, 
Jain discloses the division of the sequence of images into segments; also, in col. 23, 
ln.58 to col.24, ln.3; Jain discloses the extraction of key frames by selecting one key 
frame from every 30 frames, ie. a segment of a sequence of frames, clearly, Jain
discloses there are segments within a sequence of frames, otherwise, the 
ascertainment of key frames would not be possible without these segments, where each 
segment is formed from a sequence of 30 frames); 

performing three-dimensional reconstruction for each segment individually 
(fig. 12, note there are multiple "image to ground projection" sections that are used to
calculate and project an image or a partial model for each segment that includes
three-dimensional occupancy estimation, for which a 3D map is generated in an
attempt to form a dynamic model; col.21, ln.63 to col.22, ln.7, Jain discloses the
obtaining of the feature points within the frames; col.22, ln.62 to col.23, ln.56, Jain
discloses the use of equations that include three-dimensional coordinates (x, y, z) and that
include camera position or pose, camera angle and camera parameter to obtain a
partial model or an "image to ground projection"); and

combining the three-dimensional reconstructed segments together to recover a 
three-dimensional scene for the sequence of images (fig. 12, note the "3D visualization" 
section is the product of the adjusting of the virtual key frames to produce a complete
three-dimensional reconstruction of the two dimensional frames obtained by video
camera 1 to video camera N; also, col.24, ln.38-67, Jain discloses the key frames are 
used to obtain the best possible three-dimensional reconstruction of the two- 
dimensional frame data in that if there is not enough known points from key frames, 
estimates or bundle adjustments were made to ascertain the best, possible three- 
dimensional reconstruction of the two-dimensional frame data to yield the 3D 
visualization). 
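
For illustration only, the segment count relied on above can be checked with a short sketch. This is not taken from Jain: the function names are hypothetical, and the choice of the first frame of each segment as its key frame is an assumption, since the cited passage is characterized only as selecting one key frame from every 30 frames.

```python
def divide_into_segments(num_frames, frames_per_segment=30):
    """Return (start, end) frame-index pairs, end exclusive, covering the sequence."""
    return [
        (start, min(start + frames_per_segment, num_frames))
        for start in range(0, num_frames, frames_per_segment)
    ]


def select_key_frames(segments):
    """Pick one key frame per segment.

    Assumption for illustration: the first frame of each segment is used.
    """
    return [start for start, _end in segments]


# 412 frames at the NTSC rate of 30 frames/sec, as noted for camera 1 in fig.8
segments = divide_into_segments(412)
key_frames = select_key_frames(segments)
print(len(segments), "segments; last segment holds",
      segments[-1][1] - segments[-1][0], "frames")
```

Under these assumptions, 412 frames yield 13 full segments of 30 frames plus one shorter final segment, which is consistent with the "approximately 14 segments" noted above.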

Regarding claim 2, Jain discloses the use of virtual key frames (col.23, ln.58 to
col.24, ln.3; Jain discloses the extraction of key frames by selecting one key frame from 
every 30 frames, ie. a segment of a sequence of frames). 

Regarding claim 3, Jain discloses the use of feature points in image data (fig. 12, 
note there are multiple "image to ground projection" sections that are used to calculate
and project an image or a partial model for each segment that includes three-
dimensional occupancy estimation, for which a 3D map is generated in an attempt to
form a dynamic model; col.21, ln.63 to col.22, ln.7, Jain discloses the obtaining of the
feature points within the frames; col.22, ln.62 to col.23, ln.56, Jain discloses the use of
equations that include three-dimensional coordinates (x, y, z) and that include camera
position or pose, camera angle and camera parameter to obtain a partial model or an
"image to ground projection").

Regarding claims 4 and 13-16, Jain discloses the performance of a two-frame 
structure from motion algorithm on each of the segments to create a partial model 
(fig. 12, note there are multiple "image to ground projection" sections that are used to
calculate and project an image or a partial model for each segment that includes
three-dimensional occupancy estimation, for which a 3D map is generated in an
attempt to form a dynamic model; col.21, ln.63 to col.22, ln.7, Jain discloses the
obtaining of the feature points within the frames; col.22, ln.62 to col.23, ln.56, Jain
discloses the use of equations that include three-dimensional coordinates (x, y, z) and that
include camera position or pose, camera angle and camera parameter to obtain a
partial model or an "image to ground projection"); and eliminating ambiguity (col.24,
ln.38-67, Jain discloses the virtual key frames are used to obtain the best possible 
three-dimensional reconstruction of the two-dimensional frame data in that if there is not 
enough known points, ie. uncertainty, from virtual keyframes, thus, segmented frames 
are encoded into at least two virtual key frames to ascertain the best, possible three- 
dimensional reconstruction of the two-dimensional frame data to yield the 3D 
visualization). 

Regarding claims 5, 7 and 18, Jain discloses extracting virtual keyframes 
(col.23, ln.58 to col.24, ln.3; Jain discloses the extraction of key frames by selecting one
key frame from every 30 frames in that every 30 frames can be considered a segment 
of a sequence of frames; also, col.24, ln.38-67, Jain discloses the keyframes are used 
to obtain the best possible three-dimensional reconstruction of the two-dimensional 
frame data in that if there is not enough known points, ie. uncertainty, from key frames, 
estimates or bundle adjustments were made to ascertain the best, possible three- 
dimensional reconstruction of the two-dimensional frame data to yield the 3D 
visualization) and bundle adjustment of key frames (fig. 12, note the "3D visualization"
section is the product of the adjusting of the virtual key frames to produce a complete
three-dimensional reconstruction of the two dimensional frames obtained by video 
camera 1 to video camera N; also, col.24, ln.38-67, Jain discloses the keyframes are 
used to obtain the best possible three-dimensional reconstruction of the two- 
dimensional frame data in that if there is not enough known points from key frames, 
estimates or bundle adjustments were made to ascertain the best, possible three- 
dimensional reconstruction of the two-dimensional frame data to yield the 3D 
visualization). 

Regarding claims 6 and 12, Jain discloses identifying feature points, estimating
three-dimensional coordinates, and estimating camera rotation and translation (fig. 12,
note there are multiple "image to ground projection" sections that are used to calculate
and project an image or a partial model for each segment that includes three-
dimensional occupancy estimation, for which a 3D map is generated in an attempt to
form a dynamic model; col.21, ln.63 to col.22, ln.7, Jain discloses the obtaining of the
feature points within the frames; col.22, ln.62 to col.23, ln.56, Jain discloses the use of
equations that include three-dimensional coordinates (x, y, z) and that include camera
position or pose, camera angle and camera parameter to obtain a partial model or an
"image to ground projection").

Regarding claim 8, Jain discloses the use of a computer-readable medium to 
execute instructions for performing the method of claim 1 (col. 15, ln.65-67). 

Regarding claims 9 and 21, Jain discloses a method of recovering a three-
dimensional scene from two-dimensional images, the method comprising:

identifying a sequence of two-dimensional frames that include two-dimensional
images (fig. 12, note camera 1 obtains video images in two-dimensional form; also see
fig.8, note camera 1 obtains a sequence of two-dimensional images, and cameras 2
and 3 also obtain a corresponding sequence of images; col.22, ln.1-3);

dividing the sequence of images into segments, wherein a segment includes a 
plurality of frames (fig.8, note that camera 1 obtains a sequence of 412 frames for 
approximately 13 seconds, and that every 30 frames obtained for each second, ie. the 
standard NTSC frame rate (30 frames/sec), can be considered a segment, so in this 
case, camera 1 has approximately 14 segments, thus, Jain discloses the division of the 
sequence of images into segments; also, in col.23, ln.58 to col.24, ln.3; Jain discloses
the extraction of key frames by selecting one key frame from every 30 frames, ie. a 
segment of a sequence of frames, clearly, Jain discloses there are segments within a 
sequence of frames, otherwise, the ascertainment of key frames would not be possible 
without these segments, where each segment is formed from a sequence of 30 frames); 

for each segment, encoding the frames in the segment into at least two virtual 
frames that include a three-dimensional structure for the segment and an uncertainty 
associated with the segment (col.23, ln.58 to col.24, ln.3; Jain discloses the extraction
of virtual keyframes by selecting one key frame from every 30 frames in that every 30 
frames can be considered a segment of a sequence of frames; also, col.24, ln.38-67, 
Jain discloses the virtual key frames are used to obtain the best possible three- 
dimensional reconstruction of the two-dimensional frame data in that if there is not 
enough known points, ie. uncertainty, from virtual key frames, thus, segmented frames
are encoded into at least two virtual key frames to ascertain the best, possible three-
dimensional reconstruction of the two-dimensional frame data to yield the 3D
visualization). 

Regarding claim 10, Jain discloses identifying the base frame, identifying the
feature points in the base frame, and defining the segments (col.21, ln.63 to col.22, ln.7;
Jain discloses the identification of feature points in the plural frames that includes the 
first base frame in the segments from the sequence of images). 

Regarding claim 11, Jain discloses the variation of segments and variation of
frames (fig.8, note camera 1 has multiple 413 frames in approximately 13 seconds,
where each segment has 30 frames to obtain approximately 13 segments from camera
1, whereas camera 2 has 181 frames in 6 seconds, or approximately 6 segments from
camera 2, etc.). 

Regarding claim 19, Jain discloses performing motion estimation to identify 
feature points (col.21, ln.63 to col.22, ln.7). 

Regarding claim 22, Jain discloses the use of a computer-readable medium to 
execute instructions for performing the method of claim 9 (col. 15, ln.65-67). 

Regarding claim 36, Jain discloses a computer-readable medium having 
computer-executable instructions for performing a method comprising: 

providing a sequence of two-dimensional frames (fig. 12, note camera 1 obtains
video images in two-dimensional form; also see fig.8, note camera 1 obtains a
sequence of two-dimensional images, and cameras 2 and 3 also obtain a corresponding
sequence of images; col.22, ln.1-3);

dividing the sequence into segments (fig.8, note that camera 1 obtains a
sequence of 412 frames for approximately 13 seconds, and that every 30 frames 
obtained for each second, ie. the standard NTSC frame rate (30 frames/sec), can be 
considered a segment, so in this case, camera 1 has approximately 14 segments, thus, 
Jain discloses the division of the sequence of images into segments; also, in col.23,
ln.58 to col.24, ln.3; Jain discloses the extraction of key frames by selecting one key
frame from every 30 frames, ie. a segment of a sequence of frames, clearly, Jain 
discloses there are segments within a sequence of frames, otherwise, the 
ascertainment of key frames would not be possible without these segments, where each 
segment is formed from a sequence of 30 frames); 

calculating a partial model for each segment that includes three-dimensional 
coordinates and camera pose for features within the frames (fig. 12, note there are
multiple "image to ground projection" sections that are used to calculate and project an
image or a partial model for each segment that includes three-dimensional
occupancy estimation, for which a 3D map is generated in an attempt to form a
dynamic model; col.21, ln.63 to col.22, ln.7, Jain discloses the obtaining of the feature
points within the frames; col.22, ln.62 to col.23, ln.56, Jain discloses the use of
equations that include three-dimensional coordinates (x, y, z) and that include camera
position or pose, camera angle and camera parameter to obtain a partial model or an
"image to ground projection");

extracting virtual keyframes from each partial model, the virtual keyframes 
having three-dimensional coordinates for the frames and an uncertainty associated with
the frames (col.23, ln.58 to col.24, ln.3; Jain discloses the extraction of key frames by
selecting one key frame from every 30 frames in that every 30 frames can be 
considered a segment of a sequence of frames; also, col.24, ln.38-67, Jain discloses 
the key frames are used to obtain the best possible three-dimensional reconstruction of 
the two-dimensional frame data in that if there is not enough known points, ie. 
uncertainty, from key frames, estimates or bundle adjustments were made to ascertain 
the best, possible three-dimensional reconstruction of the two-dimensional frame data 
to yield the 3D visualization); and 

bundle adjusting the virtual key frames to obtain a complete three-dimensional 
reconstruction of the two-dimensional frames (fig. 12, note the "3D visualization" section 
is the product of the adjusting of the virtual key frames to produce a complete three- 
dimensional reconstruction of the two dimensional frames obtained by video camera 1 
to video camera N; also, col.24, ln.38-67, Jain discloses the keyframes are used to 
obtain the best possible three-dimensional reconstruction of the two-dimensional frame 
data in that if there is not enough known points from key frames, estimates or bundle 
adjustments were made to ascertain the best, possible three-dimensional reconstruction 
of the two-dimensional frame data to yield the 3D visualization). 

Regarding claim 37, Jain discloses an apparatus for recovering a three- 
dimensional scene from a sequence of two-dimensional frames by segmenting the 
frames, comprising: 

means for capturing two-dimensional images (fig. 12, note camera 1 obtains video
images in two-dimensional form; also see fig.8, note camera 1 obtains a sequence of
two-dimensional images, and cameras 2 and 3 also obtain a corresponding sequence of
images; col.22, ln.1-3);

means for dividing the sequence into segments (fig.8, note that camera 1 obtains 
a sequence of 412 frames for approximately 13 seconds, and that every 30 frames 
obtained for each second, ie. the standard NTSC frame rate (30 frames/sec), can be 
considered a segment, so in this case, camera 1 has approximately 14 segments, thus, 
Jain discloses the division of the sequence of images into segments; also, in col.23, 
ln.58 to col.24, ln.3; Jain discloses the extraction of key frames by selecting one key
frame from every 30 frames, ie. a segment of a sequence of frames, clearly, Jain
discloses there are segments within a sequence of frames, otherwise, the
ascertainment of key frames would not be possible without these segments, where each 
segment is formed from a sequence of 30 frames); 

means for calculating a partial model for each segment that includes three- 
dimensional coordinates and camera pose for features within the frames (fig. 12, note 
there are multiple "image to ground projection" sections that are used to calculate and
project an image or a partial model for each segment that includes three-dimensional
occupancy estimation, for which a 3D map is generated in an attempt to form a
dynamic model; col.21, ln.63 to col.22, ln.7, Jain discloses the obtaining of the feature
points within the frames; col.22, ln.62 to col.23, ln.56, Jain discloses the use of
equations that include three-dimensional coordinates (x, y, z) and that include camera
position or pose, camera angle and camera parameter to obtain a partial model or an
"image to ground projection");

means for extracting virtual key frames from each partial model (col.23, ln.58 to
col.24, ln.3; Jain discloses the extraction of key frames by selecting one key frame from
every 30 frames in that every 30 frames can be considered a segment of a sequence of 
frames); and 

means for bundle adjusting the virtual key frames to obtain a complete three- 
dimensional reconstruction of the two-dimensional frames (fig. 12, note the "3D 
visualization" section is the product of the adjusting of the virtual key frames to produce 
a complete three-dimensional reconstruction of the two dimensional frames obtained by 
video camera 1 to video camera N; also, col.24, ln.38-67, Jain discloses the key frames 
are used to obtain the best possible three-dimensional reconstruction of the two- 
dimensional frame data in that if there is not enough known points from key frames, 
estimates or bundle adjustments were made to ascertain the best, possible three- 
dimensional reconstruction of the two-dimensional frame data to yield the 3D 
visualization). 

Claim Rejections - 35 USC § 103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

5. Claims 17, 20 and 23-30 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Jain et al (5,729,471) in view of Lee (5,612,743).

Regarding claims 17, 23, 24 and 28, Jain discloses a method of recovering a
three-dimensional scene from a sequence of two-dimensional frames, comprising: 

segmenting the sequence of two dimensional frames (fig. 12, note camera 1
obtains video images in two-dimensional form; also see fig.8, note camera 1 obtains a
sequence of two-dimensional images, and cameras 2 and 3 also obtain a corresponding
sequence of images; see col.22, ln.1-3; fig.8, note that camera 1 obtains a sequence of
412 frames for approximately 13 seconds, and that every 30 frames obtained for each
second, ie. the standard NTSC frame rate (30 frames/sec), can be considered a
segment, so in this case, camera 1 has approximately 14 segments, thus, Jain
discloses the division of the sequence of images into segments; also, in col.23, ln.58 to
col.24, ln.3; Jain discloses the extraction of key frames by selecting one key frame from
every 30 frames, ie. a segment of a sequence of frames, clearly, Jain discloses there 
are segments within a sequence of frames, otherwise, the ascertainment of key frames 
would not be possible without these segments, where each segment is formed from a 
sequence of 30 frames); 

identifying feature points in at least a first base frame in a first segment (col.21,
ln.63 to col.22, ln.7; Jain discloses the identification of feature points in the plural frames 
that includes the first base frame); and 

analyzing a second frame in the segment to identify the feature points in the 
second frame (col.21, ln.63 to col.22, ln.7; Jain discloses the identification of feature 
points in each frame from a plurality of frames that includes the second frame).

Jain does not specifically disclose adding the second frame to the segment.
However, Jain discloses the manual adjustment of the number of key frames, where the 
number is one key frame for every thirty frames, ie. a segment (col.23, ln.64 to col.24, 
ln.3). Therefore, since Jain teaches the manual adjustment of one key frame or 
representative frame for every thirty frames, it would have been obvious to one of 
ordinary skill in the art to manually change the number of key (representative) frames 
per segment from anywhere between two to five key or representative frames per 
segment if necessary for accurately enhancing the three-dimensional representation of 
the targeted scene. 

Jain does not specifically disclose determining whether a threshold number of
feature points from the base frame are identified in the second frame; if a threshold number
of feature points are identified in the second frame, adding the second frame to the
segment; and repeating the analyzing step, determining step and adding step for
subsequent frames until the number of feature points in a frame falls below the
threshold number. However, Lee teaches determining whether a threshold number
of feature points from the base frame are identified in the second frame (col.2, ln.65 to
col.3, ln.31; Lee teaches the use of threshold values TH and comparison of threshold
values of feature points between the current frame and the reference frame to check if 
the threshold is exceeded); if a threshold number of feature points are identified in the 
second frame, adding the second frame to the segment (col.2, ln.65 to col.3, ln.31); and 
repeating the analyzing step, determining step and adding step for subsequent frames 
until the number of feature points in a frame falls below the threshold number (fig.3,
note Lee discloses the process is cyclical and repetitive, thus the analysis,
determination and addition steps are repeated). Therefore, it would have been obvious 
to one of ordinary skill in the art to combine the teachings of Jain and Lee, as a whole, 
for improving the encoding of video image data so as to accurately encode images via 
the selection of feature points according to the motion of objects in a financially robust 
manner (col.2, ln.60-64). 
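
The analyzing, determining and adding steps recited above can be sketched as follows. This is a minimal illustration of the claim language as characterized in this action, not an implementation taken from Jain or Lee; representing each frame as a set of feature point identifiers, the function names and the sample data are all assumptions.

```python
def grow_segment(frames, base_index, threshold):
    """Extend a segment from the base frame while enough base-frame
    feature points are still identified in each subsequent frame."""
    base_features = frames[base_index]
    segment = [base_index]
    for idx in range(base_index + 1, len(frames)):
        # Determining step: count base-frame feature points found in this frame
        matched = len(base_features & frames[idx])
        if matched < threshold:
            break  # count fell below the threshold, so the segment ends here
        segment.append(idx)  # adding step
    return segment


def segment_sequence(frames, threshold):
    """Repeat the process, starting a new base frame after each segment."""
    segments = []
    base = 0
    while base < len(frames):
        segment = grow_segment(frames, base, threshold)
        segments.append(segment)
        base = segment[-1] + 1
    return segments


# Hypothetical data: each frame is the set of feature point IDs visible in it
frames = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 6, 7}, {6, 7, 8, 9}, {6, 7, 8, 10}]
result = segment_sequence(frames, threshold=3)
print(result)
```

In this sketch the number of frames per segment varies with how long the base-frame feature points persist, rather than being fixed at thirty.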

Regarding claim 20, Jain does not specifically disclose creating a template block 
in a first frame, creating a search window used in the second frame, and comparing an 
intensity difference between the search window and the template block to locate the 
feature point in the second frame. However, Lee teaches creating a template block
in a first frame, creating a search window used in the second frame, and comparing an 
intensity difference between the search window and the template block to locate the 
feature point in the second frame (fig.4, note frame A and frame B are the first and 
second frames, note fig.3, element 313 also discloses the comparison process to 
compare differences to determine or locate the feature point in the second frame). 
Therefore, it would have been obvious to one of ordinary skill in the art to combine the 
teachings of Jain and Lee, as a whole, for improving the encoding of video image data 
so as to accurately encode images via the selection of feature points according to the 
motion of objects in a financially robust manner (col.2, ln.60-64). 
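
The template block and search window comparison recited in claim 20 can be illustrated with a short sketch. It is offered under stated assumptions and is not Lee's disclosed process: the sum of absolute intensity differences is a stand-in cost measure, and the function names, block size, search radius and sample frames are hypothetical.

```python
def locate_feature(first, second, top, left, block=3, radius=2):
    """Locate in `second` the template block taken from `first` at (top, left).

    The search window spans `radius` pixels around the original position.
    Returns the best (row, col) and its cost.
    """
    template = [row[left:left + block] for row in first[top:top + block]]
    rows, cols = len(second), len(second[0])
    best_pos, best_cost = None, None
    for r in range(max(0, top - radius), min(rows - block, top + radius) + 1):
        for c in range(max(0, left - radius), min(cols - block, left + radius) + 1):
            # Intensity difference between the candidate block and the template
            cost = sum(
                abs(second[r + i][c + j] - template[i][j])
                for i in range(block)
                for j in range(block)
            )
            if best_cost is None or cost < best_cost:
                best_pos, best_cost = (r, c), cost
    return best_pos, best_cost


def make_frame(top, left, size=8):
    """Hypothetical test frame: a 3x3 patterned patch on a black background."""
    frame = [[0] * size for _ in range(size)]
    for i in range(3):
        for j in range(3):
            frame[top + i][left + j] = 10 * (i + 1) + j
    return frame


first = make_frame(2, 2)
second = make_frame(3, 4)  # same patch, moved down 1 row and right 2 columns
position, cost = locate_feature(first, second, top=2, left=2)
print(position, cost)
```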

Regarding claim 25, Jain discloses performing motion estimation to identify 
feature points (col.21, ln.63 to col.22, ln.7).

Regarding claim 26, Jain discloses the identification of corners as feature points
(col.22, ln.15-22; note the disclosure of borders, hashlines, marks are feature points to
create corners as to determine camera status and pose). 

Regarding claim 27, Jain discloses the number of frames can vary between 
segments (col.23, ln.64 to col.24, ln.3). 

Regarding claim 29, Jain discloses the bundle adjustment of key frames (fig. 12, 
note the "3D visualization" section is the product of the adjusting of the virtual key 
frames to produce a complete three-dimensional reconstruction of the two dimensional 
frames obtained by video camera 1 to video camera N; also, col.24, ln.38-67, Jain
discloses the keyframes are used to obtain the best possible three-dimensional 
reconstruction of the two-dimensional frame data in that if there is not enough known 
points from keyframes, estimates or bundle adjustments were made to ascertain the 
best, possible three-dimensional reconstruction of the two-dimensional frame data to 
yield the 3D visualization). 

Regarding claim 30, Jain discloses the use of a computer-readable medium to 
execute instructions for performing the method of claim 23 (col. 15, ln.65-67). 

Claims 31-35 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Jain et al (5,729,471).

Regarding claim 31, Jain discloses a method of recovering a three-dimensional 
scene from a sequence of two-dimensional frames (fig. 12), an improvement comprising: 

dividing a long sequence of frames into segments (fig. 8, note that camera 1 
obtains a sequence of 412 frames for approximately 13 seconds, and that every 30
frames obtained for each second, ie. the standard NTSC frame rate (30 frames/sec),
can be considered a segment, so in this case, camera 1 has approximately 14
segments, thus, Jain discloses the division of the sequence of images into segments;
also, in col.23, ln.58 to col.24, ln.3; Jain discloses the extraction of key frames by
selecting one key frame from every 30 frames, ie. a segment of a sequence of frames, 
clearly, Jain discloses there are segments within a sequence of frames, otherwise, the 
ascertainment of key frames would not be possible without these segments, where each 
segment is formed from a sequence of 30 frames), 

wherein the representative frames are used to recover the three-dimensional 
scene and remaining frames are discarded so that three-dimensional scene is 
effectively compressed (col.23, ln.58 to col.24, ln.3; Jain discloses the extraction of
virtual key frames by selecting one key frame from every 30 frames in that every 30
frames can be considered a segment of a sequence of frames; also, col.24, ln.38-67,
Jain discloses the virtual keyframes are used to obtain the best possible three- 
dimensional reconstruction of the two-dimensional frame data in that if there is not 
enough known points, ie. uncertainty, from virtual keyframes, thus, segmented frames 
are encoded into at least two virtual key frames to ascertain the best, possible three- 
dimensional reconstruction of the two-dimensional frame data to yield the 3D 
visualization, the excess remaining frames are discarded). 

Jain does not specifically disclose reducing the number of frames in each
segment by representing the segments using between two and five representative
frames per segment. However, Jain discloses the manual adjustment of the number of
key frames, where the number is one key frame for every thirty frames, ie. a segment
(col.23, ln.64 to col.24, ln.3). Therefore, since Jain teaches the manual adjustment of 
one key frame or representative frame for every thirty frames, it would have been 
obvious to one of ordinary skill in the art to manually change the number of key 
(representative) frames per segment from anywhere between two to five key or 
representative frames per segment if necessary for accurately enhancing the three- 
dimensional representation of the targeted scene. 

Regarding claim 32, Jain discloses that each representative frame has an
associated uncertainty (col.24, ln.38-67, Jain discloses the keyframes are used to 
obtain the best possible three-dimensional reconstruction of the two-dimensional frame 
data in that if there is not enough known points, ie. uncertainty, from key frames, 
estimates or bundle adjustments were made to ascertain the best, possible three- 
dimensional reconstruction of the two-dimensional frame data to yield the 3D 
visualization). 

Regarding claim 33, Jain discloses the long sequence of frames includes over 75 
frames (fig.8, note that camera 1 obtains a sequence of 412 frames, which clearly is 
over 75 frames). 

Regarding claim 34, Jain discloses the division of the long sequence into 
segments and tracking feature points (fig.8, note that camera 1 obtains a sequence of 
412 frames for approximately 13 seconds, and that every 30 frames obtained for each 
second, ie. the standard NTSC frame rate (30 frames/sec), can be considered a 
segment, so in this case, camera 1 has approximately 14 segments, thus, Jain
discloses the division of the sequence of images into segments; also, in col.23, ln.58 to
col.24, ln.3; Jain discloses the extraction of key frames by selecting one key frame from
every 30 frames, ie. a segment of a sequence of frames, clearly, Jain discloses there
are segments within a sequence of frames, otherwise, the ascertainment of key frames
would not be possible without these segments, where each segment is formed from a
sequence of 30 frames; col.21, ln.63 to col.22, ln.7, Jain discloses the obtaining of the
feature points within the frames). 

Regarding claim 35, Jain discloses the performance of a two-frame structure 
from motion algorithm on each of the segments to create a partial model (fig. 12, note 
there are multiple "image to ground projection" sections that are used to calculate and
project an image or a partial model for each segment that includes three-dimensional
occupancy estimation, for which a 3D map is generated in an attempt to form a
dynamic model; col.21, ln.63 to col.22, ln.7, Jain discloses the obtaining of the feature
points within the frames; col.22, ln.62 to col.23, ln.56, Jain discloses the use of
equations that include three-dimensional coordinates (x, y, z) and that include camera
position or pose, camera angle and camera parameter to obtain a partial model or an
"image to ground projection").

Contact Information 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Allen Wong whose telephone number is (571) 272-7341.
The examiner can normally be reached on Mondays to Thursdays from 8am-6pm
Flextime.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Mehrdad Dastouri, can be reached on (571) 272-7418. The fax phone
number for the organization where this application or proceeding is assigned is
703-872-9306.

you have questions on access to the Private PAIR system, contact the Electronic